Modeling Measurement Facets and Assessing Generalizability in a Large-Scale Writing Assessment

Authors

  • Xiaohong Gao
  • Robert L. Brennan
Abstract

Measurement error and reliability are two important psychometric properties for large-scale assessments. Generalizability theory has often been used to identify sources of error and to estimate score reliability. The complicated nature of sparse matrix data collection designs in some assessments, however, can cause challenges in conducting generalizability analyses. The present study examines potential sources of measurement error associated with large-scale writing assessment scores by modeling multiple measurement components and conducting multistep analyses based on both univariate and multivariate generalizability theory. The study demonstrates how to use multiple generalizability analyses to produce approximate estimates of measurement error and reliability under complex measurement conditions when a single study design cannot capture and disentangle all measurement facets.
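
As an illustration of the kind of analysis the abstract refers to, the following is a minimal sketch of a univariate generalizability study for the simplest case: a fully crossed persons-by-raters design with one score per cell. It is not the authors' multistep procedure, and it does not handle the sparse-matrix designs the study addresses. The function names `g_study_p_x_r` and `d_study` and the toy scores are hypothetical, chosen only for this example.

```python
def g_study_p_x_r(scores):
    """Estimate variance components for a fully crossed persons x raters
    design (one score per cell) from ANOVA expected mean squares.

    `scores` is a list of rows, one row per person, one column per rater.
    Assumption: no missing cells; negative estimates are set to zero.
    """
    n_p = len(scores)
    n_r = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Sums of squares for persons, raters, and the residual (pr,e).
    ss_p = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_pr = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations for the components.
    return {
        "p": max((ms_p - ms_pr) / n_r, 0.0),
        "r": max((ms_r - ms_pr) / n_p, 0.0),
        "pr,e": max(ms_pr, 0.0),
    }


def d_study(components, n_raters):
    """Return (generalizability coefficient, dependability index) for a
    decision study that averages over `n_raters` raters.

    Assumption: person variance plus error variance is greater than zero.
    """
    rel_error = components["pr,e"] / n_raters
    abs_error = (components["r"] + components["pr,e"]) / n_raters
    g_coef = components["p"] / (components["p"] + rel_error)
    phi = components["p"] / (components["p"] + abs_error)
    return g_coef, phi


if __name__ == "__main__":
    # Made-up essay scores: 4 examinees rated by 3 raters.
    toy_scores = [[4, 5, 4], [2, 3, 3], [5, 5, 6], [3, 2, 3]]
    components = g_study_p_x_r(toy_scores)
    print(components)
    print(d_study(components, n_raters=2))
```

The generalizability coefficient uses only relative error (the person-by-rater residual), whereas the dependability index also counts rater main effects as error, which is why the two can differ when raters vary in severity.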

Similar resources

Score Generalizability of Writing Assessment: the Effect of Rater’s Gender

The score reliability of language performance tests has attracted increasing interest. Classical Test Theory cannot examine multiple sources of measurement error. Generalizability theory extends Classical Test Theory to provide a practical framework to identify and estimate multiple factors contributing to the total variance of measurement. Generalizability theory by using analysis of variance ...

Rater Errors among Peer-Assessors: Applying the Many-Facet Rasch Measurement Model

In this study, the researcher used the many-facet Rasch measurement model (MFRM) to detect two pervasive rater errors among peer-assessors rating EFL essays. The researcher also compared the ratings of peer-assessors to those of teacher assessors to gain a clearer understanding of the ratings of peer-assessors. To that end, the researcher used a fully crossed design in which all peer-assessors ...

Developing Rating Scale Descriptors for Assessing the Stages of Writing Process: The Constructs Underlying Students' Writing Performances

The purpose of the present study is to develop appropriate scoring scales for each of the defined stages of the writing process, and also to determine to what extent these scoring scales can reliably and validly assess the performances of EFL learners in an academic writing task. Two hundred and two students’ writing samples were collected after a step-by-step process oriented essay writing ins...

Medical education assessment: a brief overview of concepts in generalizability theory

The General Medical Council (GMC) in the UK has emphasized the importance of internal consistency for students' assessment scores in medical education. Typically Cronbach's alpha is reported by medical educators as an index of interna...

The Impact of Rating Methods and Task Types on EFL Learners' Writing Scores

The difficulty of assessing the writing skill is well known. Different testing facets seem to affect the result of assessing the writing skill. In addition to the writer’s ability, the topic of the writing task, and methods of rating may contribute to the writer’s score. In this study, 50 EFL learners wrote four different types of writing tasks (convincing, describing, instructing, and explaini...


Publication date: 2015